Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through multi-scale
superpixel averaging of the saliency map. We evaluate the proposed weakly
supervised top-down saliency framework and achieve performance comparable
with fully supervised approaches. Experiments are carried out on seven
challenging datasets, and quantitative results are compared with 40 closely
related approaches across 4 different applications. Comment: 14 pages, 7 figures
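The combine-then-train step described in this abstract can be sketched as follows; the multiplicative combination rule, the 0.7/0.3 thresholds, and the least-squares linear scorer (standing in for the paper's linear SVM, to keep the sketch dependency-free) are all illustrative assumptions:

```python
import numpy as np

def combine_saliency(bottom_up, top_down):
    """Combine a selected bottom-up map with the top-down map.

    Element-wise multiplication is one plausible combination rule;
    the abstract does not specify the exact operator.
    """
    combined = bottom_up * top_down
    return combined / (combined.max() + 1e-8)

rng = np.random.default_rng(0)
bu = rng.random((32, 32))               # bottom-up saliency map (toy)
td = rng.random((32, 32))               # top-down saliency map (toy)
feats = rng.normal(size=(32 * 32, 16))  # hypothetical per-pixel features

comb = combine_saliency(bu, td).ravel()

# Pixels with high combined saliency become positives, low-saliency
# pixels negatives (thresholds are illustrative). A least-squares
# linear scorer stands in for the linear SVM.
pos, neg = feats[comb > 0.7], feats[comb < 0.3]
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
Xb = np.hstack([X, np.ones((len(X), 1))])        # add bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

featsb = np.hstack([feats, np.ones((len(feats), 1))])
feature_saliency = (featsb @ w).reshape(32, 32)  # per-pixel saliency score
```

The resulting `feature_saliency` map would then be integrated with the combined saliency and refined by superpixel averaging, as the abstract describes.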
Are object detection assessment criteria ready for maritime computer vision?
Maritime vessels equipped with visible and infrared cameras can complement
other conventional sensors for object detection. However, the application of
computer vision techniques in the maritime domain has received attention only recently.
The maritime environment offers its own unique requirements and challenges.
Assessment of the quality of detections is a fundamental need in computer
vision. However, the conventional assessment metrics suitable for usual object
detection are deficient in the maritime setting. Thus, a large body of related
work in computer vision appears inapplicable to the maritime setting at first
sight. We discuss the problem of defining assessment metrics suitable for
maritime computer vision. We consider new bottom edge proximity metrics as
assessment metrics for maritime computer vision. These metrics indicate that
existing computer vision approaches are indeed promising for maritime computer
vision and can play a foundational role in the emerging field of maritime
computer vision.
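A bottom edge proximity metric of the kind discussed here can be sketched as below; the exact scoring rule (horizontal overlap of the bottom edges, discounted by their vertical distance) is an illustrative assumption, not the paper's definition:

```python
def bottom_edge_proximity(pred_box, gt_box):
    """Hypothetical bottom-edge proximity score in [0, 1].

    Boxes are (x1, y1, x2, y2) with y increasing downward. In the
    maritime setting the bottom edge of a vessel's box approximates
    its waterline, so agreement of bottom edges can matter more than
    full-box IoU. This scoring rule is an assumed sketch.
    """
    x1p, _, x2p, y2p = pred_box
    x1g, _, x2g, y2g = gt_box
    # Horizontal overlap of the two bottom edges, relative to their span.
    overlap = max(0.0, min(x2p, x2g) - max(x1p, x1g))
    span = max(x2p, x2g) - min(x1p, x1g)
    if span <= 0:
        return 0.0
    # Penalize vertical distance between bottom edges, scaled by
    # the ground-truth box height.
    h_gt = max(1.0, gt_box[3] - gt_box[1])
    vert = 1.0 - min(1.0, abs(y2p - y2g) / h_gt)
    return (overlap / span) * vert

# Identical boxes score 1; a displaced bottom edge scores lower.
print(bottom_edge_proximity((10, 10, 50, 40), (10, 10, 50, 40)))  # 1.0
print(bottom_edge_proximity((10, 10, 50, 55), (10, 10, 50, 40)))  # 0.5
```

Unlike IoU, such a score is insensitive to the top edge of the box, which is the kind of maritime-specific behavior the abstract argues conventional metrics lack.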
Towards Balanced Active Learning for Multimodal Classification
Training multimodal networks requires a vast amount of data due to their
larger parameter space compared to unimodal networks. Active learning is a
widely used technique for reducing data annotation costs by selecting only
those samples that could contribute to improving model performance. However,
current active learning strategies are mostly designed for unimodal tasks, and
when applied to multimodal data, they often result in biased sample selection
from the dominant modality. This unfairness hinders balanced multimodal
learning, which is crucial for achieving optimal performance. To address this
issue, we propose three guidelines for designing a more balanced multimodal
active learning strategy. Following these guidelines, a novel approach is
proposed to achieve fairer data selection by modulating the gradient
embedding with the dominance degree among modalities. Our studies demonstrate
that the proposed method achieves more balanced multimodal learning by avoiding
greedy sample selection from the dominant modality. Our approach outperforms
existing active learning strategies on a variety of multimodal classification
tasks. Overall, our work highlights the importance of balancing sample
selection in multimodal active learning and provides a practical solution for
achieving more balanced active learning for multimodal classification. Comment: 12 pages, accepted by ACMMM 202
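The gradient-embedding modulation idea can be sketched roughly as follows; the divide-by-dominance rule, the hypothetical dominance degrees, and the norm-based selection are simplifying assumptions, not the paper's exact method:

```python
import numpy as np

def balanced_selection(grad_embeds, dominance, k):
    """Select k samples after down-weighting the dominant modality.

    grad_embeds: dict modality -> (n_samples, d) gradient embeddings.
    dominance:   dict modality -> dominance degree in (0, 1].
    Dividing each modality's embedding by its dominance before
    concatenation boosts the influence of weaker modalities on
    selection (a simplified sketch of the abstract's guideline).
    """
    parts = [grad_embeds[m] / dominance[m] for m in sorted(grad_embeds)]
    joint = np.concatenate(parts, axis=1)
    # Pick the samples with the largest modulated gradient-embedding
    # norm, a common uncertainty proxy in gradient-based active learning.
    norms = np.linalg.norm(joint, axis=1)
    return np.argsort(norms)[::-1][:k]

rng = np.random.default_rng(0)
embeds = {"audio": rng.normal(size=(100, 8)),
          "video": rng.normal(size=(100, 8))}
# Assume video currently dominates training (hypothetical degrees).
picked = balanced_selection(embeds, {"audio": 0.3, "video": 0.9}, k=10)
```

Because the audio embeddings are scaled up by 1/0.3 versus 1/0.9 for video, samples whose informativeness comes from the under-trained audio modality are more likely to be selected.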
A Unified Framework for Guiding Generative AI with Wireless Perception in Resource Constrained Mobile Edge Networks
With the significant advancements in artificial intelligence (AI)
technologies and powerful computational capabilities, generative AI (GAI) has
become a pivotal digital content generation technique for offering superior
digital services. However, directing GAI towards desired outputs still suffers
from the inherent instability of the AI model. In this paper, we design a novel
framework that utilizes wireless perception to guide GAI (WiPe-GAI) for
providing digital content generation service, i.e., AI-generated content
(AIGC), in resource-constrained mobile edge networks. Specifically, we first
propose a new sequential multi-scale perception (SMSP) algorithm to predict
the user's skeleton based on the channel state information (CSI) extracted from
wireless signals. This prediction then guides GAI to provide users with AIGC,
such as virtual character generation. To ensure the efficient operation of the
proposed framework in resource-constrained networks, we further design a
pricing-based incentive mechanism and introduce a diffusion model-based
approach to generate an optimal pricing strategy for the service provisioning.
The strategy maximizes the user's utility while enhancing the participation of
the virtual service provider (VSP) in AIGC provision. The experimental results
demonstrate the effectiveness of the designed framework in terms of skeleton
prediction and optimal pricing strategy generation compared with other
existing solutions.
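The shape of the pricing problem can be illustrated with a plain grid search standing in for the diffusion model-based generator; the scalar `value` and `cost`, the linear utility, and the participation condition are all assumptions for illustration:

```python
import numpy as np

def optimal_price(value, cost, prices):
    """Grid-search sketch of the pricing trade-off in the abstract.

    The user's utility is assumed to be value - price; the VSP is
    assumed to participate only when price >= cost of serving the
    AIGC request. The paper generates the strategy with a diffusion
    model; a plain grid search stands in here.
    """
    feasible = prices[prices >= cost]   # VSP participation constraint
    if feasible.size == 0:
        return None
    utilities = value - feasible        # user utility at each price
    # Maximizing user utility under the constraint picks the lowest
    # feasible price in this linear toy model.
    return float(feasible[np.argmax(utilities)])

prices = np.linspace(0.0, 10.0, 101)
best = optimal_price(value=8.0, cost=2.5, prices=prices)
```

A learned generator becomes worthwhile when, unlike this toy model, utility and participation depend on noisy, high-dimensional state (channel quality, edge load), so no closed-form or grid solution is available.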